Abstract

Ajtai, Komlós, and Szemerédi proved that for sufficiently large $t$ every triangle-free
graph with $n$ vertices and average degree $t$ has an independent set of size at
least $\frac{n}{100t}\log t$. We extend this by proving that the number of independent sets in
such a graph is at least

\[ 2^{\frac{n}{2450t}\log^2 t}. \]

This result is sharp for infinitely many $t, n$ apart from the constant. An easy
consequence of our result is that there exists $c' > 0$ such that every $n$-vertex
triangle-free graph has at least

\[ 2^{c'\sqrt{n}\log n} \]

independent sets. We conjecture that the exponent above can be improved to
$\sqrt{n}(\log n)^{3/2}$. This would be sharp by the celebrated result of Kim which shows
that the Ramsey number $R(3,k)$ has order of magnitude $k^2/\log k$.



1 Introduction 

An independent set in a graph $G = (V, E)$ is a set $I \subset V$ of vertices such that no two
vertices in $I$ are adjacent. The independence number of $G$, denoted $\alpha(G)$, is the size of
the largest independent set in $G$. Determining the independence number of a graph is one
of the most pervasive and fundamental problems in graph theory. The independence
number naturally arises when studying other fundamental graph parameters like the
chromatic number (minimum size of a partition of $V$ into independent sets), clique
number (independence number of the complementary graph), minimum vertex cover
(complement of a maximum independent set), matching number (independence number
in the line graph) and many others.

Throughout this paper, we suppose that $G$ is a graph with $n$ vertices and average
degree $t$. Turán's [11] basic theorem of extremal graph theory, in complementary form,
states that $\alpha(G) \ge \lceil n/(t+1) \rceil$ for any graph $G$. This bound is tight, as demonstrated by
the complement of Turán's graph $G = T(n,r)$ which, in the case $n = kr$, is the disjoint
union of $r$ cliques, each with $k$ vertices (then $\alpha(G) = r$ and $t = k - 1$). Since $G$ contains
large cliques it is natural to ask whether Turán's bound on $\alpha(G)$ can be improved if we
prohibit cliques of a prescribed (small) size in $G$.

In [1], Ajtai, Komlós, and Szemerédi showed that if $G$ contains no $K_3$, then this is
indeed the case, by improving Turán's bound by a factor that is logarithmic in $t$. More
precisely, they proved that if $G$ is triangle-free, then

\[ \alpha(G) \ge \frac{1}{100}\frac{n}{t}\log t. \]

Shortly after, Shearer [10] improved this to $\alpha(G) \ge (1 - o(1))\frac{n}{t}\log t$ (assume for
convenience throughout this paper that $\log = \log_2$). Random graphs pU] show that for
infinitely many $t$ and $n$ with $t = t(n) \to \infty$ as $n \to \infty$, there are $n$-vertex triangle-free
graphs with average degree $t$ and independence number $(2 - o(1))\frac{n}{t}\log t$.
Consequently, the results of [1, 10] cannot be improved apart from the multiplicative constant.

There is a tight connection between the problem of determining $\alpha(G)$ and questions in
Ramsey theory. More precisely, determining the minimum possible $\alpha(G)$ for a
triangle-free $G$ is equivalent to determining the Ramsey number $R(3, k)$, which is the minimum
$n$ so that every graph on $n$ vertices contains a triangle or an independent set of size
$k$. Moreover, the above lower bounds for $\alpha(G)$ are equivalent to the upper bound
$R(3,t) = O(t^2/\log t)$. It was a major open problem, dating back to the 1940's, to
determine the order of magnitude of $R(3,t)$, and this was achieved by Kim [7] who
showed that for every $n$ sufficiently large, there exists an $n$-vertex triangle-free graph $G$
with $\alpha(G) < 9\sqrt{n\log n}$. As a consequence, the upper bound $R(3,t) = O(t^2/\log t)$ from
[1] is of the correct order of magnitude.

In this paper, our goal is to take the result of Ajtai, Komlos, and Szemeredi [1] further 
by not only finding an independent set of the size guaranteed by their result, but by 
showing that many of the vertex subsets of approximately that size are independent 
sets. 

Definition 1. Given a graph $G$, let $i(G)$ denote the number of independent sets in $G$.
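
For small graphs, $i(G)$ can be computed directly from the definition. The following is a
minimal brute-force sketch in Python and is not part of the paper's argument; the function
name `count_independent_sets` and the edge-list input format are assumptions made for
illustration. The empty set is counted as independent, which matches the bound
$i(G) \ge 2^{\alpha}$ used in this paper.

```python
def count_independent_sets(n, edges):
    """Count independent sets (empty set included) of a graph on vertices 0..n-1."""
    adj = [0] * n                      # adj[v] = bitmask of the neighbours of v
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    total = 0
    for mask in range(1 << n):         # every vertex subset, encoded as a bitmask
        # the subset is independent iff no chosen vertex has a chosen neighbour
        if all((adj[v] & mask) == 0 for v in range(n) if (mask >> v) & 1):
            total += 1
    return total


path3 = [(0, 1), (1, 2)]                       # path on 3 vertices
cycle5 = [(i, (i + 1) % 5) for i in range(5)]  # the triangle-free 5-cycle
print(count_independent_sets(3, path3))
print(count_independent_sets(5, cycle5))
```

The running time is exponential in $n$, so this is only useful for checking tiny examples by hand.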

Upper bounds for $i(G)$ have been motivated by combinatorial group theory. In [2],
Alon showed that if $G$ is a $d$-regular graph, then $i(G) \le 2^{(1/2+o(1))n}$ (with the $o(1)$
term taken as $d \to \infty$); he also conjectured that

\[ i(G) \le (2^{d+1} - 1)^{n/2d}. \]

Kahn [6] proved this conjecture for $d$-regular bipartite graphs. Galvin [5] obtained a
similar bound for any $d$-regular graph $G$, namely

\[ i(G) \le 2^{\frac{n}{2}\left(1 + \frac{1}{d} + \frac{c}{d}\sqrt{\frac{\log d}{d}}\right)} \]

for some constant $c$. Finally, Zhao [12] recently resolved Alon's conjecture.
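
As a sanity check on the conjectured bound, the complete bipartite graph $K_{d,d}$ has exactly
$2^{d+1} - 1$ independent sets (any subset of one side, with the empty set counted once), so a
disjoint union of $n/2d$ copies attains $(2^{d+1} - 1)^{n/2d}$. The Python sketch below checks
this by brute force for tiny cases; the helper names are illustrative assumptions, not
notation from the paper.

```python
def count_independent_sets(n, edges):
    """Brute-force count of independent sets (empty set included)."""
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    return sum(
        1
        for mask in range(1 << n)
        if all((adj[v] & mask) == 0 for v in range(n) if (mask >> v) & 1)
    )


def disjoint_kdd(d, copies):
    """Return (n, edges) for `copies` vertex-disjoint copies of K_{d,d}."""
    edges = []
    for c in range(copies):
        base = 2 * d * c
        left = range(base, base + d)
        right = range(base + d, base + 2 * d)
        edges += [(u, v) for u in left for v in right]
    return 2 * d * copies, edges


for d, copies in [(2, 1), (3, 1), (2, 2)]:
    n, edges = disjoint_kdd(d, copies)
    exact = count_independent_sets(n, edges)
    bound = (2 ** (d + 1) - 1) ** copies   # (2^{d+1} - 1)^{n/2d} with n = 2d * copies
    print(d, copies, exact, bound)
```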

In this paper, we consider lower bounds for $i(G)$. This problem is fundamental in
extremal graph theory; indeed, the Erdős-Stone theorem [4] gives a lower bound for
$i(G)$ that is the correct order of magnitude provided $n/t$ is a constant. More recently,
the problem in the range $t = \Omega(n)$ has been investigated by Razborov [9], Nikiforov [8],
and Reiher. For example, the results of Razborov and Nikiforov determine $g(\rho,3)$, the
minimum triangle density of an $n$-vertex graph with edge density $\frac{1}{2} < \rho < 1$. Looking at
the complementary graph, this gives tight lower bounds on the number of independent
sets of size three in a graph with density $1 - \rho < \frac{1}{2}$.

Lower bounds for $i(G)$ appear not to have been studied with the same intensity when
$t$ is much smaller than $n$, in particular, when $t \to \infty$ and $t/n \to 0$. Let us make some
easy observations that are relevant for our work here. We assume that $\alpha := \alpha(G) \le n/4$.
Since every subset of an independent set is also independent, Turán's theorem implies

\[ i(G) \ge 2^{\alpha} \ge 2^{n/(t+1)}. \]

In Section 2 we will improve this to

\[ i(G) \ge 2^{\frac{n}{210t}\log t}. \qquad (1) \]

Our proof uses the standard probabilistic argument which establishes the order of
magnitude given by Turán's bound on $\alpha(G)$. This result is certainly not new, and we present
it only to serve as a warm-up for our main result in Section 3. Let us observe below
that the result is essentially tight.

As no subset of size more than $\alpha(G)$ is independent, an easy upper bound on $i(G)$
(using $\alpha \le n/4$) is

\[ i(G) \le \sum_{i=0}^{\alpha} \binom{n}{i} \le 2\binom{n}{\alpha}. \qquad (2) \]

Since $\alpha(T(kr, r)) = r = n/(t + 1)$ (recall that $n = kr$ and $t = k - 1$), this bound implies
that as $n \to \infty$

\[ i(T(kr,r)) \le 2\binom{kr}{r} \le 2(ek)^r = 2^{1 + r\log ek} = 2^{O\left(\frac{n}{t}\log t\right)}. \]






Thus, apart from the constant, the exponent in (1) cannot be improved.
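
The tightness computation can be checked numerically. In the disjoint union of $r$ cliques
on $k$ vertices each, an independent set picks at most one vertex per clique, so
$i(T(kr,r)) = (k+1)^r$ exactly. The Python sketch below (helper names are assumptions for
illustration) compares $\log_2$ of this exact count with the Turán-type exponent
$n/(t+1) = r$ and with the upper-bound exponent $1 + r\log_2(ek)$.

```python
import math


def log2_exact_count(k, r):
    """log2 of i(G) for r disjoint k-cliques: each clique offers k + 1 choices."""
    return r * math.log2(k + 1)


def log2_upper_bound(k, r):
    """log2 of 2 * (e * k)^r."""
    return 1 + r * math.log2(math.e * k)


for k, r in [(4, 10), (16, 100), (256, 1000)]:
    lower = r                          # exponent n/(t+1) with n = k*r, t = k - 1
    exact = log2_exact_count(k, r)
    upper = log2_upper_bound(k, r)
    print(k, r, lower, round(exact, 1), round(upper, 1))
    assert lower <= exact <= upper
```

The exact exponent $r\log_2(k+1)$ sits between the two bounds and grows like $(n/t)\log t$, in line with the claim that only the constant in the exponent can be improved.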

Our main result addresses the case where $G$ contains no triangles. As in the case of
the independence number, prohibiting triangles improves the bound in (1).

Theorem 2. (Main Result) Suppose that $G$ is a triangle-free graph on $n$ vertices with
average degree $t$, where $t$ is sufficiently large. Then

\[ i(G) \ge 2^{\frac{n}{2450t}\log^2 t}. \qquad (3) \]

Suitable modifications of random graphs provide constructions of $n$-vertex triangle-free
graphs $G$ with average degree $t = t(n) \to \infty$ as $n \to \infty$, and $\alpha(G) = O((n/t)\log t)$.
Plugging this into (2), we see that Theorem 2 is tight (apart from the constant) for
infinitely many $t$. However, it remains open if the theorem is sharp for all $t$ where
$t = n^{1/2+o(1)}$. Indeed, the open problem that remains is to obtain a sharp lower bound
on $i(G)$ for triangle-free graphs with no restriction on degree. Since all subsets of the
neighborhood of a vertex of maximum degree are independent, $i(G) \ge 2^t$. Combining
this with (3) we get

\[ i(G) \ge \max\left\{2^t, 2^{\frac{n}{2450t}\log^2 t}\right\} \ge 2^{c n^{1/2}\log n} \]

for some constant $c > 0$. We conjecture that this can be improved as follows.
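
To see where the exponent $c n^{1/2}\log n$ comes from, here is a rough sketch of the
balancing step. It is written with a generic constant $C$ in place of the constant in (3),
and it assumes $t \ge e^2$ (so that $\log^2 t / t$ is decreasing) and that $t$ is large
enough for (3) to apply. The value $c = 1/(4C)$ is only what this calculation gives, not a
constant claimed in the text.

\[
\begin{aligned}
t \ge \sqrt{n}\log n \;&\Longrightarrow\; 2^{t} \ge 2^{\sqrt{n}\log n},\\
t \le \sqrt{n}\log n \;&\Longrightarrow\; \frac{n}{Ct}\log^2 t
  \;\ge\; \frac{n}{C}\cdot\frac{\log^2(\sqrt{n}\log n)}{\sqrt{n}\log n}
  \;\ge\; \frac{n}{C}\cdot\frac{\left(\frac{1}{2}\log n\right)^2}{\sqrt{n}\log n}
  \;=\; \frac{1}{4C}\sqrt{n}\log n.
\end{aligned}
\]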

Conjecture 3. There is an absolute positive constant $c$ such that every $n$-vertex
triangle-free graph $G$ satisfies

\[ i(G) \ge 2^{c n^{1/2}(\log n)^{3/2}}. \]

The conjecture, if true, is sharp (apart from the constant in the exponent) by the
graphs (due to Kim [7] and more recently Bohman [3]) which show that $R(3,t) =
\Omega(t^2/\log t)$. Indeed, their graphs are triangle-free and have independence number

\[ \alpha(G) = \Theta(t) = \Theta\left(\sqrt{n\log t}\right) = \Theta\left(\sqrt{n\log n}\right), \]

so $i(G) \le 2^{O(\sqrt{n}\log^{3/2} n)}$ by (2).

As mentioned before, throughout the paper, all logarithms are base 2. For a graph 
G, let n(G),e(G) and t(G) denote the number of vertices, edges, and average degree of 
G. 

2 General case 

In this section, we give the simple proof of (1). Our purpose in doing this is to familiarize
the reader with the general approach to the proof of Theorem 2 in the next section.
Proposition 4. If $G$ is a graph on $n$ vertices with average degree $t$, where $2 \le t \le$ …,
then $i(G) \ge 2^{\frac{n}{210t}\log t}$.

Proof. Set $k = \lfloor \frac{1}{100}\frac{n}{t} \rfloor$. Pick a $k$-set uniformly at random from all $k$-sets in $V(G)$. Let
$H$ be the subgraph induced by the $k$ vertices. Then

\[ E[e(H)] = \frac{1}{2}nt\,\frac{\binom{k}{2}}{\binom{n}{2}} = \frac{1}{2}nt\,\frac{k(k-1)}{n(n-1)} \le \frac{1}{2}\frac{tk^2}{n}. \]

Recall that Markov's inequality states that if $X$ is a positive random variable and $a > 0$,
then $\Pr[X \ge a] \le E[X]/a$; hence $\Pr[e(H) \ge tk^2/n] \le 1/2$. So for at least half of the
choices for $H$, $e(H) < tk^2/n$. Therefore, the number of choices of $H$ for which $e(H) < tk^2/n$
is at least



If n \ > l(-\<* > 2^ logn/k = 2K lo s n - lo g fc ). (4) 
2 V k J 2 



Now, if $e(H) < \frac{tk^2}{n} \le \frac{1}{100}k$, then at most $\frac{1}{50}k$ of the vertices in $H$ have degree at least
one. This in turn implies that $H$ contains an independent set $I$ of size at least $\frac{49}{50}k$. The
set $I$ can be obtained from any $H$ which contains it; the number of ways to pick the $\frac{1}{50}k$
vertices of $H - I$ is at most

\[ \binom{n}{k/50} \le \left(\frac{50ne}{k}\right)^{k/50} = 2^{\frac{k}{50}\log 50ne - \frac{k}{50}\log k} < 2^{\frac{k}{50}(\log n - \log k) + \frac{k}{50}\log 100t}. \qquad (5) \]

Combining this with (Hj) and using f§T of<^<T5offort<g^j, 

i(G) > 2 fc( 5^ )(log "" logfc) ~^ lo s 100 * > 2 fc M lo s 100 *-|j lo g 100 * = 2 fc f§ lo g 100 * > 2aioT lo g* 

□ 
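
The sampling step of this proof is easy to simulate. The Python sketch below draws random
$k$-subsets with $k = \lfloor n/(100t) \rfloor$ and estimates the fraction whose induced
subgraph has fewer than $tk^2/n$ edges, which Markov's inequality guarantees is at least
one half over all $k$-subsets. The test graph (a cycle on 2000 vertices), the trial count,
the seed, and the function name are assumptions chosen for illustration only.

```python
import random


def sample_sparse_fraction(n, edges, trials=500, seed=0):
    """Estimate the fraction of random k-subsets H with e(H) < t*k^2/n."""
    rng = random.Random(seed)
    t = 2 * len(edges) / n             # average degree
    k = int(n / (100 * t))             # k = floor(n / (100 t))
    threshold = t * k * k / n
    good = 0
    for _ in range(trials):
        chosen = set(rng.sample(range(n), k))
        # count the edges of G with both endpoints in the sampled set
        e_h = sum(1 for u, v in edges if u in chosen and v in chosen)
        if e_h < threshold:
            good += 1
    return k, good / trials


# Hypothetical test graph: a cycle on 2000 vertices (average degree 2).
n = 2000
edges = [(i, (i + 1) % n) for i in range(n)]
k, frac = sample_sparse_fraction(n, edges)
print(k, frac)
```

A sampled fraction is only an estimate of the true proportion, so this illustrates the Markov step rather than proving it.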



3 Triangle-free graphs

In this section we prove our main result, Theorem 2. We begin with some modifications
of a lemma from [1] (see the proof of Lemma 4 in [1]).

Lemma 5. (Ajtai-Komlós-Szemerédi [1]) Suppose that $G$ is a triangle-free graph on
$n$ vertices with average degree $t$, and let $k \le n/100t$. Let $H$ be the subgraph consisting
of $k$ vertices chosen uniformly at random from all the $k$-sets contained in $\{v \in V(G) :
\deg(v) \le 10t\}$. Let $M$ be the subgraph of $G$ consisting of vertices adjacent to no vertex
in $H$. Let $n'$ and $t'$ denote the number of vertices and average degree of $M$. Then the
random variables $H$ and $M$ satisfy:



E[n(M) 



E[e(M) 




2 y n-20r 

1 k(k-l) tk 2 

- nt —, \ < — 

2 nyn — 1) n 

2nk{t + \){m + I) 
n-k-20t-2 



(6) 



(7) 



E[e(H) 
Var [n(M)



< nt 



(9) 



Var[e(M) 



Var[e(#) 



< 2400H 4 < 40nt 3 

< tk 2 (10k + n) 

~ n 2 



(10) 



(11) 



Further, if $e(M) \le (1 + \delta) E[e(M)]$ and $n(M) \ge (1 - \delta) E[n(M)]$, then $n'/t' \ge \nu n/t$,
where $\delta = 800\sqrt{t/n}$ and $\nu = 1 - 1/t - c_{10}\sqrt{t/n}$ for some positive constant $c_{10}$.




Remark. Ajtai-Komlós-Szemerédi state their lemma for $k = n/100t$ and prove each of
the first inequalities in (6), (7), (9), and (10) for all $k$. They prove each of the second
inequalities for $k = n/100t$, but it is easily observed that they continue to hold for
$k \le n/100t$.

The next lemma is implied by the computation in the proof of Lemma 4 from [1].
However, the last statement of Lemma 6 is crucial to our proof of Theorem 2, so we
make the computations in [1] explicit.

Lemma 6. Suppose $G$ is a triangle-free graph on $n \ge 2^{50}$ vertices with average degree
$t \le 2\sqrt{n}\log n$ and $k \le n/100t$. Then $G$ contains a subgraph $H$ with $n(H) = k$ and $e(H) \le
k/50$. Moreover, if $M$ is the subgraph of $G$ consisting of vertices adjacent to no vertex
in $H$, then

1. $n(M) \ge n(G)/2$ and

2. $n(M)/t(M) \ge \nu n/t$, where $\nu = 1 - 1/t - c_{10}\sqrt{t/n}$.

Further, if the vertices in $H$ are chosen uniformly at random from all the $k$-sets contained
in $\{v \in V(G) : \deg(v) \le 10t\}$, then at least half of the choices for $H$ satisfy $e(H) \le k/50$,
along with conditions 1 and 2.
Proof. Recall that for a random variable $X$ and $a > 0$, Chebyshev's inequality states
that $\Pr[|X - E[X]| \ge a] \le \mathrm{Var}[X]/a^2$. Thus, with $a = k/50 - E[e(H)]$, Lemma 5
implies

Pr e (H) > k 50] < - 

1 V ; ~ 1 J " n 2 {k/50-^) 2 

t(10k + n) 



< 



^(1/50 -g)2 

t(n/10t + n) 
n 2 (l/50 - 1/200) 2 
1/10 + 1 



< 5000- 

n 



n(3/200) 2 
t 
n 

2 loen 

< 5000 — ^~ 

< 1/1000. 

So with probability at most 1/1000, the condition $e(H) \le k/50$ fails.

Set $\delta = 800\sqrt{t/n}$. Again by Lemma 5 and Chebyshev, with $a = \delta E[n(M)]$,

Tit 

Pr[n(M) < n/2] < Pr[n(M) < (1 - 5) E[n(M)]] < < 1/1000. 

v 10 / n 

Thus the probability that condition 1 fails is at most 1/1000.
With $a = \delta E[e(M)]$,

40nt 3 

Pr[e(M) >(1 + S) E[e(M)]] < = 1/160. 

10 n 

Since $\Pr[e(M) \ge (1 + \delta) E[e(M)]$ or $n(M) \le n/2] \le 1/160 + 1/1000$, the last assertion of
Lemma 5 implies that the probability of condition 2 failing is at most $1/160 + 1/1000$.
Therefore, the probability that condition $e(H) \le k/50$ fails or condition 1 fails or
condition 2 fails is at most $1/1000 + 1/1000 + 1/160 + 1/1000 < 1/2$. □

Our proof of Theorem 2 is achieved by analyzing Algorithm 1 below. The algorithm
is a slight modification of the algorithm from [1] that yields an independent set of size
$\frac{n}{100t}\log t$. Recall that $c_{10}$ is the constant that appears in Lemma 5.

Algorithm 1: Independent set algorithm

    Input: Triangle-free graph $G$ with $n$ vertices, average degree $t$
    Output: Independent set $I$
     1  $M_0 = G$;
     2  $R = \lfloor (\log t)/2 \rfloor$;
     3  for $i = 0$ to $R$ do
     4      $n_i$ = number of vertices in $M_i$;
     5      $t_i$ = average degree in $M_i$;
     6      $\nu_i = 1 - 1/t_{i-1} - c_{10}\sqrt{t_{i-1}/n_{i-1}}$;
     7      if $i = 0$ or $\nu_i \ge 1 - 1/\log t$ then
     8          Apply Lemma 6 with $G = M_i$ and $k = \lfloor \frac{1}{200}\frac{n}{t} \rfloor$;
     9          $M_{i+1}, H_{i+1} = M, H$ from Lemma 6
    10      else
    11          $I$ = Independent set in $M_{i-1}$ of size $\lceil n_{i-1}/(t_{i-1} + 1) \rceil$;
    12          return $I$;
    13      end
    14  end
    15  $H = H_1 \cup \cdots \cup H_R$;
    16  $I$ = Independent set in $H$ of size $\lceil \frac{24}{25}kR \rceil$;
    17  return $I$;

Theorem 7. Suppose Algorithm 1 is run on a triangle-free graph $G$ with $n$ vertices and
average degree $t$, where $2^{100} \le t \le \sqrt{n}\log n$ and $n \ge (3c_{10})^{12}$. If Algorithm 1 terminates
at line 12, then $|I| \ge \frac{n}{t}\log^2 t$. Otherwise, for each iteration $i = 0, \ldots, R - 1$, Algorithm
1 successfully applies Lemma 6 to the graph $M_i$ to obtain a graph $H_{i+1}$ with $k = \lfloor \frac{n}{200t} \rfloor$
vertices. Moreover, the graph $H$ in line 15 is the disjoint union of the $H_i$, and the
independent set $I$ in line 16 consists of $\lceil \frac{24}{25}kR \rceil$ vertices from $H$.

Proof. We break our proof into two cases, depending on whether Algorithm 1 terminates
at line 17 or 12.

Line 17. We need to show that Lemma 6 can be applied at every iteration and that
the graph $H$ in line 15 contains an independent set of size at least $\frac{24}{25}kR = \Omega\left(\frac{n}{t}\log t\right)$.

If $i = 0$, then $k \le n/100t$, $t \le \sqrt{n}\log n$, and $n \ge t \ge 2^{50}$, so we may apply Lemma 6
to obtain graphs $M_1$ and $H_1$, where $|V(H_1)| = k$. Suppose that $i > 0$ and that Lemma
6 was successfully applied at all previous iterations. Using 1 of Lemma 6, $i \le R$, and
$R = \lfloor (\log t)/2 \rfloor \le (\log n)/2$,

\[ n_i \ge n/2^i \ge n/2^R \ge \sqrt{n} \ge \sqrt{t} \ge 2^{50}. \qquad (12) \]

By the condition in line 7, $\nu_j \ge 1 - 1/\log t$ for each iteration $j \le i$. So by 2 of Lemma 6,
$\frac{n_i}{t_i} \ge \frac{n}{t}\nu_1\nu_2\cdots\nu_i \ge \frac{n}{t}(1 - 1/\log t)^R$. Thus

\[ \frac{n_i}{t_i} \ge \frac{n}{t}(1 - 1/\log t)^R \ge \frac{n}{t}\left(1 - \frac{R}{\log t}\right) \ge \frac{n}{2t}. \]

In particular,

\[ k \le \frac{1}{200}\frac{n}{t} \le \frac{1}{100}\frac{n_i}{t_i}, \qquad (13) \]

and also, $t \le \sqrt{n}\log n$ and $n_i \le n$ yield

\[ t_i \le 2n_i\frac{t}{n} \le 2n_i\frac{\log n}{\sqrt{n}} \le 2n_i\frac{\log n_i}{\sqrt{n_i}} = 2\sqrt{n_i}\log n_i. \qquad (14) \]

The inequalities (12), (13), and (14) ensure that we may again apply Lemma 6 with
$M_i$, $M_{i+1}$, $H_{i+1}$, and $k$ playing the roles of $G$, $M$, $H$, and $k$, respectively. Applying Lemma
6 $R$ times yields a collection of sparse graphs $H_1, H_2, \ldots, H_R$, each with $k$ vertices.
Each $H_i$ contains at most $\frac{1}{25}k$ vertices of degree at least one, so each $H_i$ contains an
independent set of size at least $\frac{24}{25}k$. By definition of $M_i$, these independent sets may be
combined into one independent set of size at least $\frac{24}{25}kR$.

Line 12. Suppose that the algorithm terminates at line 12 during iteration $i + 1$. Then

'3\2/3 



rii, ti (and n i+ i, t i+i ) have been defined and u i+ i = 1 — 1/tj — Cioa/^Av If U < 
then l/ti > l/log 3 t. Assume U > (|) 2/3 . Then 



2- 



\t- l/Z > l/U. (15) 



By (112]) . rij > A/n, so for ra > (3ci ) 12 , 

n\ > n 3 / 2 > (3c 10 ) 6 n > (3c 10 ) 6 t 4 . 

This implies 



^t 1 / 3 > c 1Q yJti/rii. (16) 



Combining (Fl5]) and ( 1TB]) yields 



1 - *i 1/3 < 1 - 1/U ~ cioJ - = fi+i < 1 - V log! 

V nj 

Thus l/ti > l/log 3 t. Since t > 2 100 (which implies that t > (log 5 t + log 2 1) 2 ) and 
2 < R< ^f, 

rii n 1 n 1 n y/t n 2 

> XT- 5 > — 5 r = — o T > — log t. 



ti + 1 2 i (log 3 t + l) " Vt(log 3 t+l) t(log 3 t + l) t 

Turán's theorem now implies that $M_i$ contains an independent set of size at least
$\frac{n}{t}\log^2 t$. □

We now complete the proof of Theorem 2 by obtaining a lower bound on the number
of outcomes given by line 17 of Algorithm 1.

Proof of Theorem 2. Recall that we are to show that if $G$ is a triangle-free graph
on $n$ vertices with average degree $t$ sufficiently large, then $i(G) \ge 2^{\frac{n}{2450t}\log^2 t}$. Assume
$t \ge \max\{(3c_{10})^{12}, 2^{100}\}$. Then $n \ge t \ge (3c_{10})^{12}$. Also, if $t \ge \sqrt{n}\log n$, then $G$ has
a vertex whose neighborhood contains at least $t \ge \sqrt{n}\log n \ge \frac{n}{t}\log^2 n$ vertices. Since
$G$ is triangle-free, this neighborhood forms an independent set, which contains at least
$2^{\frac{n}{t}\log^2 n} \ge 2^{\frac{n}{2450t}\log^2 t}$ subsets, which are also independent. Thus we may assume that
$t \le \sqrt{n}\log n$; in particular, $G$ satisfies the hypotheses of Theorem 7.

If the algorithm terminates at line 12, then $G$ contains an independent set of size at
least $\frac{n}{t}\log^2 t$; so $G$ contains at least $2^{\frac{n}{t}\log^2 t} \ge 2^{\frac{n}{2450t}\log^2 t}$ independent sets, and we are
done. Thus we may assume that Algorithm 1 terminates at line 17. Consequently, at
each iteration $i$, the algorithm applies Lemma 6 to pick a sparse graph with $k = \lfloor \frac{n}{200t} \rfloor$
vertices. The vertices in this graph are chosen from

\[ L_i = \{v \in V(M_i) : \deg(v) \le 10t_i\}. \]

Note that

\[ n_i t_i = \sum_{v \in L_i} \deg(v) + \sum_{v \in V(M_i) - L_i} \deg(v) \ge \sum_{v \in V(M_i) - L_i} \deg(v) \ge (n_i - |L_i|)10t_i. \]

This, together with 1 in Lemma 6, implies $|L_i| \ge \frac{9}{10}n_i \ge \frac{9}{10}n_{i-1}/2 \ge \frac{9}{10}n/2^i$. At least
half of the $k$-sets in $L_i$ satisfy the conditions of Lemma 6, so the number of choices for
$H_i$ is at least

\[ \frac{1}{2}\binom{|L_i|}{k} \ge \frac{1}{2}\binom{.9n/2^i}{k}. \]

Therefore, the number of choices for the sequence $H_1, \ldots, H_R$ is at least

i=0 v 7 i=0 v 7 

— 2 kR1 °S -9n-kRlogk-kR 2 /2-R 

_ 2kR(logn-logk)-kR 2 /2+kRlog .9-R 

> 2fc-R(logn-logfc)-Mi2|Li_fc_R__R 

> 2kR{\ogn-\ogk)-^f\ogt-2kR fyj\ 

Recall that Algorithm 1 obtains an independent set $I$ of size $\lceil \frac{24}{25}kR \rceil$ from the graph
$H = H_1 \cup \cdots \cup H_R$. For a fixed $I$, the number of graphs $H$ that yield $I$ is at most the
number of possibilities for $H - I$. This is at most $\binom{n}{|V(H-I)|}$, which is at most

U ] < / 50rze 2_ kR _ „ ^ fc.R log n+^fc.R log 50e-|j fell log 2fcii 

±kRJ ~ K 2kR > 

^ 2^kR(logn-\ogk)+-^kR\og50e 

< 2^j fcfl ( lo s™ _lo s fc )+ fc - R (18) 

For a fixed $H$, the number of partitions $H_1 \cup \cdots \cup H_R = H$ is at most the number of
partitions of $kR$ elements into $R$ sets of size $k$, which is less than

/A;i?\ ^ (R e } kR — 2 kRl °s Re < 2 kRlogR+2kR < 2 kR i R+2kR < 2 kR \ logt+2kR (19) 

Since each $H$ yields an independent set, the total number of independent sets that can
be returned at line 17 of the algorithm is at least

\[ \frac{\#\text{ of ways to obtain } H}{(\#\text{ of } H \text{ that yield a fixed } I)\,(\#\text{ of partitions that yield } H)}. \]

Since < k < 7^ and R > (logt)/3, ( IT71) . ( fl8j) . and ( 1T9|) imply that this is at least 

2§§fcR(logn-logfc)-§A;mogt-5fc.R > 2^ kRlo S^00t-^kRlogt-5kR 

> 2ii Mlog * 

> 2iiT> fclog2t 

> 22iSoT lo g 2 *. 

□ 